Skip to content

fix: replace copy with deepcopy in Mooncake pullbacks #723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Feb 9, 2025
Merged

Conversation

gdalle
Copy link
Member

@gdalle gdalle commented Feb 9, 2025

  • Replace copy with deepcopy when returning general tangents to fix How to use Mooncake in Lux chalk-lab/Mooncake.jl#467. Add a shortcut for numbers and arrays to avoid paying the price of deepcopy.
  • Use Mooncake.value_and_gradient!! for DI.gradient. The same reasoning applies for copying before returning.
  • Add type annotations on f::F everywhere to force specialization.

@willtebbutt does this look good?

Copy link

codecov bot commented Feb 9, 2025

Codecov Report

Attention: Patch coverage is 96.15385% with 1 line in your changes missing coverage. Please review.

Project coverage is 97.98%. Comparing base (08f09b4) to head (6778b17).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...MooncakeExt/DifferentiationInterfaceMooncakeExt.jl 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #723      +/-   ##
==========================================
- Coverage   97.99%   97.98%   -0.01%     
==========================================
  Files         121      121              
  Lines        6341     6363      +22     
==========================================
+ Hits         6214     6235      +21     
- Misses        127      128       +1     
Flag Coverage Δ
DI 98.94% <96.15%> (-0.02%) ⬇️
DIT 95.92% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@willtebbutt willtebbutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Largely LGTM. Would it make sense to add a regression test which checks that the new copying mechanism works for parameters which are e.g. Tuples / NamedTuples ?

@gdalle
Copy link
Member Author

gdalle commented Feb 9, 2025

I would like to ensure the absence of regression more systematically by using test scenarios which involve tuples or nested structs. One option is to define new ones (#724). Another option would be to implement the necessary standardization so that Mooncake can be tested on Flux and Lux scenarios (no idea how complicated that would be). A third option would be to simply check that Mooncake runs on these neural net scenarios, without ensuring correctness because comparison to the reference gradient requires standardization. Thoughts?

@willtebbutt
Copy link
Member

Hmmm interesting. Is it documented anywhere what standardisations must be applied in order to support Flux / Lux? Additionally, do the two frameworks agree? I'd be keen to get Mooncake tested on them as part of DI / Mooncake's own tests.

I don't want to slow down this PR though, so perhaps just merge this, and we can discuss elsewhere?

@gdalle
Copy link
Member Author

gdalle commented Feb 9, 2025

Not documented, but these standardizations are necessary to compare the gradient returned by different backends. But I've never even tried Mooncake on such scenarios so maybe it does the right thing / agrees with Zygote out of the box. Let's merge this and discuss further in #724

@gdalle gdalle merged commit 4e694bb into main Feb 9, 2025
48 of 50 checks passed
@gdalle gdalle deleted the gd/mooncake_copy branch February 10, 2025 00:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How to use Mooncake in Lux
2 participants